Improved Deep Learning Baselines for Ubuntu Corpus Dialogs

Authors

  • Rudolf Kadlec
  • Martin Schmid
  • Jan Kleindienst
Abstract

This paper presents the results of our experiments on the Ubuntu Dialog Corpus, the largest publicly available multi-turn dialog corpus. First, we use an in-house implementation of previously reported models to perform an independent evaluation on the same data. Second, we evaluate the performance of various LSTMs, Bi-LSTMs and CNNs on the dataset. Third, we create an ensemble by averaging the predictions of multiple models. The ensemble further improves performance and achieves a state-of-the-art result for this dataset. Finally, we discuss our future plans for this corpus.
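The abstract describes the ensemble only as "averaging predictions of multiple models". A minimal sketch of that idea follows, assuming each model assigns one score to every candidate response for a given dialog context. The function names, the three model labels, and all scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def average_predictions(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-candidate scores across models.

    model_scores[m][c] is the score model m assigns to candidate response c.
    """
    if not model_scores:
        raise ValueError("need at least one model")
    n_candidates = len(model_scores[0])
    if any(len(scores) != n_candidates for scores in model_scores):
        raise ValueError("all models must score the same candidates")
    n_models = len(model_scores)
    return [
        sum(scores[c] for scores in model_scores) / n_models
        for c in range(n_candidates)
    ]


def rank_candidates(scores: Sequence[float]) -> List[int]:
    """Return candidate indices sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)


# Hypothetical scores from three models for four candidate responses
# (illustrative numbers only, not results from the paper).
lstm_scores = [0.20, 0.70, 0.40, 0.10]
bilstm_scores = [0.30, 0.60, 0.65, 0.20]
cnn_scores = [0.10, 0.40, 0.75, 0.15]

ensemble = average_predictions([lstm_scores, bilstm_scores, cnn_scores])
ranking = rank_candidates(ensemble)
```

In this made-up example the LSTM alone prefers candidate 1, while the averaged scores rank candidate 2 first. Whether the authors weight models or average a different quantity is not stated in the abstract, so the unweighted mean here is an assumption.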

Similar resources

Enhance word representation for out-of-vocabulary on Ubuntu dialogue corpus

The Ubuntu dialogue corpus is the largest publicly available dialogue corpus, making it feasible to build end-to-end deep neural network models directly from the conversation data. One challenge of the Ubuntu dialogue corpus is the large number of out-of-vocabulary words. In this paper we propose a method which combines the general pre-trained word embedding vectors with those generated on the task-speci...

Content-Learning Correlations in Spoken Tutoring Dialogs at Word, Turn, and Discourse Levels

We study correlations between dialog content and learning in a corpus of human-computer tutoring dialogs. Using an online encyclopedia, we first extract domain-specific concepts discussed in our dialogs. We then extend previously studied shallow dialog metrics by incorporating content at three levels of granularity (word, turn and discourse) and also by distinguishing between students’ spoken an...

Deep Neural Network Approach for the Dialog State Tracking Challenge

While belief tracking is known to be important in allowing statistical dialog systems to manage dialogs in a highly robust manner, until recently little attention has been given to analysing the behaviour of belief tracking techniques. The Dialogue State Tracking Challenge has allowed for such an analysis, comparing multiple belief tracking approaches on a shared task. Recent success in using d...

Training End-to-End Dialogue Systems with the Ubuntu Dialogue Corpus

In this paper, we analyze neural network-based dialogue systems trained in an end-to-end manner using an updated version of the recent Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This dataset is interesting because of its size, long context lengths, and technical nature; thus, it can be use...

Building a Corpus of Phrases Related to Learning for Sentiment Analysis

Learning-centered emotions, unlike basic emotions, emerge during deep learning activities and have an important relation to the cognitive processes of students. In this paper we present the creation process of a corpus of phrases (opinions) related to learning computer programming. Opinions (textual phrases) are categorized into different emotions related to learning such as frustrated, bored, ne...


Journal title:
  • CoRR

Volume: abs/1510.03753

Publication date: 2015